CS 267 Final Report: Reproducible Parallel Matrix-Vector Multiply

Author

  • Peter Ahrens
Abstract

Parallel code can be difficult to verify because parallel execution models are inherently non-reproducible. When debugging or writing tests, users benefit from getting the same result on different runs of the same code. This is the goal of the ReproBLAS project (Nguyen et al.). ReproBLAS [3] has so far introduced a reproducible floating-point type (the indexed float) and the associated algorithms underlying its serial and parallel BLAS1 routines. One long-term goal of ReproBLAS is the implementation of a fully featured reproducible PBLAS [6]. ReproBLAS has yet to define a formal interface for its underlying algorithms, and it is not yet known whether the existing low-level routines can be assembled into reasonably efficient higher-level routines. Here, we implement a reproducible matrix-vector multiply in order to gauge the feasibility of building a more complex reproducible linear algebra library and to identify the challenges involved.
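The reproducibility problem described above comes from floating-point addition not being associative: a different reduction order can give a different result. The following is a minimal Python sketch of that effect for a matrix-vector multiply. It does not use ReproBLAS or its indexed float type. As a stand-in, it uses `math.fsum` from the Python standard library, which returns a correctly rounded sum and is therefore independent of summation order. The function names `naive_matvec` and `fsum_matvec` and the test data are hypothetical, chosen only for illustration.

```python
import math

def naive_matvec(A, x, order):
    """Accumulate each row's dot product in the given column order."""
    y = []
    for row in A:
        acc = 0.0
        for j in order:
            acc += row[j] * x[j]  # rounding here depends on the order
        y.append(acc)
    return y

def fsum_matvec(A, x, order):
    """Same products, summed with math.fsum (a stand-in, not ReproBLAS).

    math.fsum returns the correctly rounded sum of its inputs, so the
    result does not depend on the order in which products are visited.
    """
    return [math.fsum(row[j] * x[j] for j in order) for row in A]

# Hypothetical data: large values that cancel, plus small values.
A = [[1e16, 1.0, -1e16, 1.0],
     [0.1, 0.2, 0.3, 0.4]]
x = [1.0, 1.0, 1.0, 1.0]

forward = [0, 1, 2, 3]   # one possible reduction order
shuffled = [0, 2, 1, 3]  # another order, as a parallel run might produce

print(naive_matvec(A, x, forward)[0], naive_matvec(A, x, shuffled)[0])
print(fsum_matvec(A, x, forward) == fsum_matvec(A, x, shuffled))
```

In this sketch the naive accumulation gives different values for the first entry under the two orders, while the `math.fsum` version agrees with itself. Note that `math.fsum` needs all products in one place, so it is not a scalable parallel reduction; it only illustrates the order-independence property that a reproducible routine is meant to provide.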

Similar resources

CS 267 Final Project: Parallel Robust PCA

Principal Component Analysis (PCA; Pearson, 1901) is a widely used method for data compression. The goal is to find the best low-rank approximation of a given matrix, as judged by minimization of the ℓ2 norm of the difference between the original matrix and the low-rank approximation. However, the classical method is not resistant to corruption of individual input data points. Recently, a robus...

Charge-Mode Parallel Architecture for Vector–Matrix Multiplication

An internally analog, externally digital architecture for parallel vector–matrix multiplication is presented. A three-transistor unit cell combines a single-bit dynamic random-access memory and a charge injection device binary multiplier and analog accumulator. Digital multiplication of variable resolution is obtained with bit-serial inputs and bit-parallel storage of matrix elements, by combini...

CS 267 Final Project: A parallel incompressible two-phase flow solver for complex geometries

We present a fully 3D coupled level set projection method in arbitrary logically rectangular geometries for studying two-phase immiscible incompressible flow in enclosed regions. We discuss our numerical methodology, the parallelization of our algorithm, and current results.

A Library for Parallel Sparse Matrix Vector Multiplies

We provide parallel matrix-vector multiply routines for 1D and 2D partitioned sparse square and rectangular matrices. We give clear pseudocode for the initializations needed for parallel execution. We show how to maximize the overlap between communication and computation through proper use of the compressed sparse row and compressed sparse column formats of the sparse matrices. We ...

CS-TR-2007-002: Error Analysis of Various Forms of Floating Point Dot Products

This paper discusses both the theoretical and statistical errors obtained by various dot product algorithms. A host of linear algebra methods derive their error behavior directly from the dot product. In particular, most high performance dense systems derive their performance and error behavior overwhelmingly from matrix multiply, and matrix multiply’s error behavior is almost wholly attributable t...

Publication date: 2015